Version: V11

Configuring VIDIZMO Indexer for PII Detection and Redaction

The VIDIZMO Indexer provides detection and redaction capabilities using AI models. One key capability is detecting and redacting personally identifiable information (PII) in transcribed audio and video files.

The VIDIZMO Indexer also supports visual PII detection through optical character recognition (OCR), which lets you detect PII in documents, images, and videos that contain text. This offers an alternative method of PII detection that is especially useful for videos with on-screen text but no audio to transcribe.

If you want to learn more about this functionality, visit Understanding PII Detection and Redaction using VIDIZMO Indexer.

Note: If you are performing PII detection via transcription, you need your audio or video files to be transcribed. The VIDIZMO Indexer app will automatically generate transcriptions for your content if you have configured it for PII detection. You can also add transcriptions in the following ways:

Pre-requisites

To use this feature, make sure the following are true:

  • You have feature permissions for App Management.
  • You have feature permissions for PII Detection and Redaction.

To open the VIDIZMO Indexer's settings:

  1. Use the button at the top left to open the action menu.
  2. Click the Admin dropdown.
  3. Select Portal Settings.
  4. While in Portal Settings, click Apps.
  5. Select Content Processing.
  6. Use the settings icon to open the VIDIZMO Indexer's settings.

Configuring the VIDIZMO Indexer for PII

The following screen allows you to configure how the Indexer App handles Personally Identifiable Information (PII). Each option controls how PII is detected, processed, and redacted in the portal.

  1. Media Formats: Select the file formats you want to detect PII in.

    Note: In the EVCM Portal, this field appears as Media formats. In the DEM Portal, it appears as Evidence formats.

  2. Insights: Select the PII entities you want to detect in your content (for the full list, see PII Entities). You can also create your own PII entities and include them in detection by selecting Custom PII (see How to Create Custom Patterns for details).

Note: You can also add other AI insights (such as Chaptering) to generate them alongside PII detection or redaction.

  3. PII Detection

  • Confidence Score: Set the minimum confidence score the model must reach to recognize a detected term as a PII entity. A term is classified as PII only if its confidence exceeds this minimum value. The confidence score can be set from 0 to 100. Recommended: 35.
  • Excluded Words: Provide a list of words that must not be detected or marked as PII, even if they are already defined as PII within the application. For example, 'John' is considered a PII entity, but adding 'John' to 'Excluded Words' prevents the application from identifying it as PII. Note that this field is case-sensitive.
  • Context Keywords: Provide the context words that the model will analyze to improve the confidence score for a custom PII entity.
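
As a rough illustration of how the Confidence Score and Excluded Words settings interact, the following Python sketch filters a list of hypothetical detections. The `Detection` class and the `filter_detections` function are illustrative names only, not part of the VIDIZMO product or any API. The sketch assumes nothing beyond the behavior described above: a term counts as PII only when its confidence exceeds the minimum, and excluded words are matched case-sensitively.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical record for one detected term (not a VIDIZMO type)."""
    text: str
    entity: str
    confidence: float  # 0-100, same scale as the Confidence Score setting


def filter_detections(detections, min_confidence=35, excluded_words=()):
    """Keep detections whose confidence exceeds the minimum and whose
    text is not in the excluded list (case-sensitive comparison)."""
    excluded = set(excluded_words)
    return [
        d for d in detections
        if d.confidence > min_confidence and d.text not in excluded
    ]


detections = [
    Detection("John", "Person", 90),       # excluded by the word list
    Detection("john", "Person", 80),       # kept: different case, so not excluded
    Detection("Seattle", "Location", 30),  # dropped: confidence not above 35
]
kept = filter_detections(detections, min_confidence=35, excluded_words=["John"])
print([d.text for d in kept])  # prints ['john']
```

Because the comparison is case-sensitive, excluding 'John' does not exclude 'john'; each spelling you want excluded would need its own entry.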

  4. Automatic Object Redaction Settings

  • Redaction Types: Choose the PII entities you wish to redact from your audio, video, or documents by selecting them from the drop-down menu.
  • Confidence Threshold for Redaction: Set the minimum confidence level (10–99) that the model must reach to identify a term as PII. Only terms with a confidence score above this value are automatically redacted.
  • Audio Redaction Type: Select how detected PII is handled in the audio. The selected redaction type determines what replaces the detected PII.
    • Bleep: Replaces the PII with a short bleep sound. Available only for .wav files.
    • Mute: Silences the section containing the PII. Used by default for all other audio file formats.
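
The two redaction rules above can be sketched in Python as follows. Both function names are hypothetical and do not correspond to any VIDIZMO API. The sketch assumes that a Bleep request on a non-.wav file falls back to Mute and that the file extension check is case-insensitive, neither of which is stated explicitly above.

```python
from pathlib import Path


def effective_audio_redaction(filename, requested="Bleep"):
    """Return the redaction type that would apply under the rule above:
    Bleep only for .wav files, Mute for every other format.
    Assumption: the extension is compared case-insensitively."""
    is_wav = Path(filename).suffix.lower() == ".wav"
    if requested == "Bleep" and is_wav:
        return "Bleep"
    return "Mute"


def should_auto_redact(confidence, threshold):
    """A term is redacted automatically only when its confidence is
    above the configured threshold (allowed range: 10-99)."""
    if not 10 <= threshold <= 99:
        raise ValueError("threshold must be between 10 and 99")
    return confidence > threshold


print(effective_audio_redaction("interview.wav"))  # prints Bleep
print(effective_audio_redaction("interview.mp3"))  # prints Mute
print(should_auto_redact(80, 50))                  # prints True
```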

You can change these settings later in the Process Modal or Studio Space when performing automatic PII redaction.

  5. Advanced Processing

  • Action for original file: Select how the original content is handled after PII detection or redaction finishes. This setting applies only when Automatic Processing is on.

    1. Retain File: The original content remains unaffected, while a copy is created, processed for PII insights, and then published.  
    2. Delete and Move to Recycle Bin: The original content is deleted and moved to the Recycle Bin, while a copy of the content is processed for PII insights, and then published.
    3. Override Original File: The original content is processed for PII insights while it remains published. No copies are made, and the content isn’t relocated.
  • Time Interval Threshold: Specify the time interval threshold (in milliseconds) for more accurate redaction of PII. During audio activity detection, the threshold value determines which PII detections are selected; the configured Start Time and End Time corrections are then applied to those audio segments for complete redaction.

  • Start Time Correction: Specify a value (in milliseconds) by which the start time of a PII detection is adjusted. This setting is helpful when the beginning of a PII entity is skipped during detection. For instance, if "John" is left out of the PII entity "John Doe," the start time correction can help capture the entire entity for complete classification.

  • End Time Correction: Specify a value (in milliseconds) by which the end time of a PII detection is adjusted. This setting is helpful when the end of a PII entity is skipped during detection. For instance, if "Doe" is left out of the PII entity "John Doe," the end time correction can help capture the entire entity for complete classification.
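
To make the effect of the two corrections concrete, here is a minimal Python sketch. The function name and parameters are hypothetical, and the direction of each adjustment is an assumption inferred from the examples above: the start correction moves the segment start earlier, and the end correction moves the segment end later. The Time Interval Threshold is left out because its selection logic is not specified in enough detail here.

```python
def apply_time_corrections(start_ms, end_ms,
                           start_correction_ms=0, end_correction_ms=0,
                           media_duration_ms=None):
    """Widen a detected PII segment (all values in milliseconds).

    Assumption: the start correction is subtracted from the start time
    and the end correction is added to the end time. The result is
    clamped to the media bounds when a duration is given.
    """
    new_start = max(0, start_ms - start_correction_ms)
    new_end = end_ms + end_correction_ms
    if media_duration_ms is not None:
        new_end = min(new_end, media_duration_ms)
    return new_start, new_end


# Only "Doe" was detected, from 1200 ms to 1600 ms. A 400 ms start
# correction pulls the segment start back so that "John" is covered too.
print(apply_time_corrections(1200, 1600, start_correction_ms=400,
                             end_correction_ms=100))  # prints (800, 1700)
```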

  6. Automatic Processing: Turn on automatic processing to redact PII as soon as content is uploaded to your Portal.
  7. Click Save Changes to apply your settings.
  8. Make sure the app is turned on so you can process content with these settings.

For instructions on detecting and redacting PII, see How to Detect and Redact PII using VIDIZMO Indexer.